{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Keras Functional API"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recall: All models (layers) are callables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "from keras.layers import Input, Dense\n",
    "from keras.models import Model\n",
    "\n",
    "# this returns a tensor\n",
    "inputs = Input(shape=(784,))\n",
    "\n",
    "# a layer instance is callable on a tensor, and returns a tensor\n",
    "x = Dense(64, activation='relu')(inputs)\n",
    "x = Dense(64, activation='relu')(x)\n",
    "predictions = Dense(10, activation='softmax')(x)\n",
    "\n",
    "# this creates a model that includes\n",
    "# the Input layer and three Dense layers\n",
    "model = Model(inputs=inputs, outputs=predictions)\n",
    "model.compile(optimizer='rmsprop',\n",
    "              loss='categorical_crossentropy',\n",
    "              metrics=['accuracy'])\n",
    "model.fit(data, labels)  # starts training\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Multi-Input Networks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Keras Merge Layer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's a good use case for the functional API: models with multiple inputs and outputs. \n",
    "\n",
    "The functional API makes it easy to manipulate a large number of intertwined datastreams.\n",
    "\n",
    "Let's consider the following model. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "from keras.layers import Dense, Input\n",
    "from keras.models import Model\n",
    "from keras.layers.merge import concatenate\n",
    "\n",
    "left_input = Input(shape=(784, ), name='left_input')\n",
    "left_branch = Dense(32, input_dim=784, name='left_branch')(left_input)\n",
    "\n",
    "right_input = Input(shape=(784,), name='right_input')\n",
    "right_branch = Dense(32, input_dim=784, name='right_branch')(right_input)\n",
    "\n",
    "x = concatenate([left_branch, right_branch])\n",
    "predictions = Dense(10, activation='softmax', name='main_output')(x)\n",
    "\n",
    "model = Model(inputs=[left_input, right_input], outputs=predictions)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Resulting Model will look like the following network:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"../imgs/multi_input_model.png\" />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Such a two-branch model can then be trained via e.g.:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])\n",
    "model.fit([input_data_1, input_data_2], targets)  # we pass one data array per model input\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Try yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 1: Get Data - MNIST"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# let's load MNIST data as we did in the exercise on MNIST with FC Nets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# %load ../solutions/sol_821.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 2: Create the Multi-Input Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## try yourself\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## `evaluate` the model on test data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keras supports different Merge strategies:\n",
    "\n",
    "* `add`: element-wise sum\n",
    "* `concatenate`: tensor concatenation. You can specify the concatenation axis via the argument concat_axis.\n",
    "* `multiply`: element-wise multiplication\n",
    "* `average`: tensor average\n",
    "* `maximum`: element-wise maximum of the inputs.\n",
    "* `dot`: dot product. You can specify which axes to reduce along via the argument dot_axes. You can also specify applying any normalisation. In that case, the output of the dot product is the cosine proximity between the two samples."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also pass a function as the mode argument, allowing for arbitrary transformations:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "merged = Merge([left_branch, right_branch], mode=lambda x: x[0] - x[1])\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Even more interesting\n",
    "\n",
    "Here's a good use case for the functional API: models with multiple inputs and outputs. \n",
    "\n",
    "The functional API makes it easy to manipulate a large number of intertwined datastreams.\n",
    "\n",
    "Let's consider the following model (from: [https://keras.io/getting-started/functional-api-guide/](https://keras.io/getting-started/functional-api-guide/) )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem and Data\n",
    "\n",
    "We seek to predict how many retweets and likes a news headline will receive on Twitter. \n",
    "\n",
    "The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc. \n",
    "\n",
    "The model will also be supervised via two loss functions. \n",
    "\n",
    "Using the main loss function earlier in a model is a good regularization mechanism for deep models.\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/keras.io/img/multi-input-multi-output-graph.png\" width=\"40%\" />"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    }
   ],
   "source": [
    "from keras.layers import Input, Embedding, LSTM, Dense\n",
    "from keras.models import Model\n",
    "\n",
    "# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.\n",
    "# Note that we can name any layer by passing it a \"name\" argument.\n",
    "main_input = Input(shape=(100,), dtype='int32', name='main_input')\n",
    "\n",
    "# This embedding layer will encode the input sequence\n",
    "# into a sequence of dense 512-dimensional vectors.\n",
    "x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)\n",
    "\n",
    "# A LSTM will transform the vector sequence into a single vector,\n",
    "# containing information about the entire sequence\n",
    "lstm_out = LSTM(32)(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from keras.layers import concatenate\n",
    "\n",
    "auxiliary_input = Input(shape=(5,), name='aux_input')\n",
    "x = concatenate([lstm_out, auxiliary_input])\n",
    "\n",
    "# We stack a deep densely-connected network on top\n",
    "x = Dense(64, activation='relu')(x)\n",
    "x = Dense(64, activation='relu')(x)\n",
    "x = Dense(64, activation='relu')(x)\n",
    "\n",
    "# And finally we add the main logistic regression layer\n",
    "main_output = Dense(1, activation='sigmoid', name='main_output')(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Model Definition"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We compile the model and assign a weight of 0.2 to the auxiliary loss. \n",
    "\n",
    "To specify different **loss_weights or loss** for each different output, you can use a list or a dictionary. Here we pass a single loss as the loss argument, so the same loss will be used on all outputs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Note: \n",
    "Since our inputs and outputs are named (we passed them a \"name\" argument), \n",
    "We can compile&fit the model via:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "model.compile(optimizer='rmsprop',\n",
    "              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},\n",
    "              loss_weights={'main_output': 1., 'aux_output': 0.2})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "\n",
    "# And trained it via:\n",
    "model.fit({'main_input': headline_data, 'aux_input': additional_data},\n",
    "          {'main_output': labels, 'aux_output': labels},\n",
    "          epochs=50, batch_size=32)\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
